CONFIDENCE LIMITS FOR THE EXPECTED VALUE OF AN ARBITRARY
BOUNDED RANDOM VARIABLE WITH A CONTINUOUS DISTRIBUTION FUNCTION


BY 

T.  W.  ANDERSON 

TECHNICAL  REPORT  NO.  1 

OCTOBER  1,  1969 

PREPARED  UNDER  THE  AUSPICES 
OF 

OFFICE  OF  NAVAL  RESEARCH  CONTRACT  #N00014-67-A-0112-0030 

DEPARTMENT OF STATISTICS
STANFORD UNIVERSITY
STANFORD, CALIFORNIA


1. Introduction.


Consider a random variable $X$ with a continuous cumulative distribution function $F(x)$ such that $F(a) = 0$ and $F(b) = 1$ for known finite numbers $a$ and $b$ ($a < b$). The distribution function $F(x)$ is unknown. A sample of size $n$ is drawn from this distribution. Confidence limits for the expected value $\mathcal{E}X$ are to be found that hold for all continuous distribution functions with range $[a, b]$.


2. Confidence limits for the mean.

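Let $x^{(1)} \le x^{(2)} \le \cdots \le x^{(n)}$ be the ordered observations in the sample of $n$ from $F(x)$, and let $x^{(0)} = a$ and $x^{(n+1)} = b$. The empirical cumulative distribution function in $[a, b]$ is

(1)    $F_n(x) = j/n, \quad x^{(j)} \le x < x^{(j+1)}, \quad j = 0, 1, \ldots, n$;

$F_n(x) = 0$ for $x < a$ and $F_n(x) = 1$ for $x > b$.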


Let $\beta$ and $\gamma$ be numbers (depending on $n$) such that the probability of

(2)    $F_n(x) - \beta \le F(x) \le F_n(x) + \gamma$,    all $x$,

is $1 - \alpha$, the desired confidence level. Note that $F_n(x) - \beta \le 0$ for $x < x^{(r+1)}$, where $r = [n\beta]$, the largest integer in $n\beta$, and $F_n(x) + \gamma \ge 1$ for $x \ge x^{(n-s)}$, where $s = [n\gamma]$. Since $0 \le F(x) \le 1$, the left-hand inequality in (2) is effective only for $x^{(r+1)} \le x \le b$ and is replaced by $0 \le F(x)$ for $a \le x < x^{(r+1)}$; the right-hand inequality is effective only for $a \le x < x^{(n-s)}$ and is replaced by $F(x) \le 1$ for $x^{(n-s)} \le x \le b$. Inequalities (2) over these ranges are equivalent to

(3)    $\frac{j}{n} - \beta \le F(x^{(j)}), \quad j = r+1, \ldots, n; \qquad F(x^{(j)}) \le \frac{j-1}{n} + \gamma, \quad j = 1, \ldots, n-s.$


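The distribution satisfying the first part of (3) for given $x^{(1)}, \ldots, x^{(n)}$ which has the largest mean is the distribution which has a jump of $(r+1)/n - \beta$ at $x^{(r+1)}$, jumps of $1/n$ at $x^{(j)}$, $j = r+2, \ldots, n$, and a jump of $\beta$ at $b$. This fact leads to the inequality

(4)    $\mathcal{E}X \le \frac{1}{n}\left[(r+1)\,x^{(r+1)} + \sum_{j=r+2}^{n} x^{(j)}\right] + \beta\left[b - x^{(r+1)}\right].$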

Similarly the distribution satisfying the second part of (3) which has the smallest mean is the distribution which has a jump of $\gamma$ at $a$, jumps of $1/n$ at $x^{(j)}$, $j = 1, \ldots, n-s-1$, and a jump of $(s+1)/n - \gamma$ at $x^{(n-s)}$; this fact leads to the inequality

(5)    $\frac{1}{n}\left[\sum_{j=1}^{n-s-1} x^{(j)} + (s+1)\,x^{(n-s)}\right] - \gamma\left[x^{(n-s)} - a\right] \le \mathcal{E}X.$


The inequalities (4) and (5) hold simultaneously with probability $1 - \alpha$, and these furnish the desired confidence limits. The distributions yielding the upper and lower bounds for $\mathcal{E}X$ are the lower and upper bounds to the distribution of $X$. If we use integration by parts,

(6)    $\mathcal{E}X = \int_a^b x \, dF(x) = b - \int_a^b F(x) \, dx$;

the bounds for $\mathcal{E}X$ can be verified by integrating the bounds for $F(x)$.
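
As an illustration that is not part of the original report, the limits (4) and (5) can be sketched in Python. The function names `mean_confidence_limits` and `upper_limit_by_integration` are hypothetical, and the values of $\beta$ and $\gamma$ are assumed to be supplied by the user (for instance from Kolmogorov tables). The second function recomputes the upper limit through the integration by parts in (6), which gives a simple numerical check.

```python
import math

def mean_confidence_limits(sample, a, b, beta, gamma):
    """Return (lower, upper) from inequalities (5) and (4).

    Assumes every observation lies in [a, b] and 0 <= beta, gamma <= 1.
    """
    x = sorted(sample)              # x[j-1] plays the role of x^(j)
    n = len(x)
    r = math.floor(n * beta)        # r = [n*beta]
    s = math.floor(n * gamma)       # s = [n*gamma]
    if r >= n:                      # left-hand inequality in (2) is vacuous
        upper = b
    else:
        upper = ((r + 1) * x[r] + sum(x[r + 1:])) / n + beta * (b - x[r])
    if s >= n:                      # right-hand inequality in (2) is vacuous
        lower = a
    else:
        k = n - s - 1               # x[k] is x^(n-s)
        lower = (sum(x[:k]) + (s + 1) * x[k]) / n - gamma * (x[k] - a)
    return lower, upper

def upper_limit_by_integration(sample, a, b, beta):
    """Upper limit as b minus the integral of max(F_n(x) - beta, 0), as in (6)."""
    pts = [a] + sorted(sample) + [b]    # pts[j] is x^(j), j = 0, ..., n+1
    n = len(pts) - 2
    area = sum(max(j / n - beta, 0.0) * (pts[j + 1] - pts[j])
               for j in range(n + 1))
    return b - area

# Small made-up sample on [0, 1]; beta = gamma = 0.3 is chosen only for illustration.
lower, upper = mean_confidence_limits([0.9, 0.2, 0.5, 0.4], 0.0, 1.0, 0.3, 0.3)
```

For this made-up sample the two routes to the upper limit agree, which is the verification by integration described in the text.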

The inequalities (2) constitute confidence limits for the cumulative distribution function. Values of $\beta$ and $\gamma$ for specified values of $n$ and $1 - \alpha$ have been given for $\beta = \gamma$ and for $\beta$ or $\gamma$ equal to 1, making the corresponding inequality vacuous. These are the significance points of the two-sided and one-sided Kolmogorov tests; asymptotic and other approximations are available, as well as tables.
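
The following sketch, which is not from the original report, computes only the classical large-sample approximations to these significance points: the one-sided tail $\exp(-2nd^2)$ and the two-sided Kolmogorov series $2\sum_{k \ge 1} (-1)^{k-1} \exp(-2k^2\lambda^2)$ with $d = \lambda/\sqrt{n}$. The function names are hypothetical, and for small $n$ exact tables should be preferred.

```python
import math

def one_sided_asymptotic(n, alpha):
    """Approximate d with P(sup(F_n - F) > d) = alpha, using exp(-2 n d^2)."""
    return math.sqrt(math.log(1.0 / alpha) / (2.0 * n))

def kolmogorov_tail(lam, terms=100):
    """Limiting two-sided tail probability 2 * sum (-1)^(k-1) exp(-2 k^2 lam^2)."""
    return 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                     for k in range(1, terms + 1))

def two_sided_asymptotic(n, alpha):
    """Approximate common value beta = gamma for confidence level 1 - alpha."""
    lo, hi = 0.2, 5.0               # the tail is decreasing on this bracket
    for _ in range(100):            # plain bisection
        mid = 0.5 * (lo + hi)
        if kolmogorov_tail(mid) > alpha:
            lo = mid                # tail still too large: move right
        else:
            hi = mid
    return 0.5 * (lo + hi) / math.sqrt(n)

beta_two_sided = two_sided_asymptotic(100, 0.05)   # use beta = gamma = this value
beta_one_sided = one_sided_asymptotic(100, 0.05)   # use with gamma = 1 (vacuous)
```

In the notation of (2), the two-sided value is used as $\beta = \gamma$, while the one-sided value is used for one of $\beta$, $\gamma$ with the other set equal to 1.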

If $\beta$ is an integer divided by $n$, namely $r/n$, the inequality (4) is

(7)    $\mathcal{E}X \le \frac{1}{n}\left[r\,b + \sum_{j=r+1}^{n} x^{(j)}\right]$;

the upper confidence limit is the mean of the sample,

$\bar{x} = \frac{1}{n} \sum_{j=1}^{n} x^{(j)}$,

with the $r$ smallest observations replaced by the upper bound. If $\gamma = s/n$, (5) is

(8)    $\frac{1}{n}\left[\sum_{j=1}^{n-s} x^{(j)} + s\,a\right] \le \mathcal{E}X$;

the lower confidence limit is the mean of the sample with the $s$ largest observations replaced by the lower bound.
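
A brief sketch of this special case, not taken from the original report and with hypothetical function names, replaces the extreme observations by the bounds and averages:

```python
def upper_limit_replaced(sample, b, r):
    """Sample mean with the r smallest observations replaced by b (case beta = r/n)."""
    x = sorted(sample)
    return (r * b + sum(x[r:])) / len(x)

def lower_limit_replaced(sample, a, s):
    """Sample mean with the s largest observations replaced by a (case gamma = s/n)."""
    x = sorted(sample)
    return (sum(x[:len(x) - s]) + s * a) / len(x)

# Made-up sample on [0, 1] with n = 4 and beta = gamma = 1/4, so r = s = 1.
data = [0.2, 0.4, 0.5, 0.9]
upper = upper_limit_replaced(data, 1.0, 1)   # mean of 1.0, 0.4, 0.5, 0.9
lower = lower_limit_replaced(data, 0.0, 1)   # mean of 0.2, 0.4, 0.5, 0.0
```

With $\beta = \gamma = 1/4$ these values coincide with what the general expressions (4) and (5) give for the same sample.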


This development suggests that it is necessary to assume an upper bound to the range of the random variable in order to obtain an upper confidence limit for the mean and to assume a lower bound to the range to obtain a lower confidence limit. Indeed, in order that the mean exist, conditions on the tails of the distribution are needed, but one cannot verify these conditions with a positive probability on the basis of a finite number of observations. Of course, the need for these bounds limits the applicability of the procedure.


3. Confidence limits for other parameters.

Let $g(x)$ be a monotonically (strictly) increasing function over the interval $[a, b]$. Then the distribution satisfying (2) which has the largest $\mathcal{E}g(X)$ is that which has the largest $\mathcal{E}X$, and correspondingly the distribution satisfying (2) with the smallest $\mathcal{E}g(X)$ is that with the smallest $\mathcal{E}X$. The resulting inequalities are

(9)    $\mathcal{E}g(X) \le \frac{1}{n}\left[(r+1)\,g(x^{(r+1)}) + \sum_{j=r+2}^{n} g(x^{(j)})\right] + \beta\left[g(b) - g(x^{(r+1)})\right],$

(10)    $\frac{1}{n}\left[\sum_{j=1}^{n-s-1} g(x^{(j)}) + (s+1)\,g(x^{(n-s)})\right] - \gamma\left[g(x^{(n-s)}) - g(a)\right] \le \mathcal{E}g(X).$

The inequalities (9) and (10) will hold simultaneously for all monotonically (strictly) increasing functions $g(x)$. (The inequality (9), for instance, will hold for $b = \infty$ if $g(b)$ is bounded as $b \to \infty$.)

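We can apply (9) and (10) to find confidence limits for the variance $\sigma^2 = \mathcal{E}X^2 - (\mathcal{E}X)^2$ when $a \ge 0$, so that $g(x) = x^2$ is increasing on $[a, b]$. Then we have bounds simultaneously on $\mathcal{E}X$ and $\mathcal{E}X^2$, say $L_1 \le \mathcal{E}X \le U_1$ and $L_2 \le \mathcal{E}X^2 \le U_2$. The former is equivalent to $L_1^2 \le (\mathcal{E}X)^2 \le U_1^2$. Thus

(11)    $L_2 - U_1^2 \le \sigma^2 \le U_2 - L_1^2$.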
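
To make (9), (10), and (11) concrete, here is a minimal Python sketch that is not part of the original report. The names `expectation_limits` and `variance_limits` are hypothetical; $g$ is assumed to be strictly increasing on $[a, b]$, and the variance limits assume $a \ge 0$ so that $x^2$ is increasing there.

```python
import math

def expectation_limits(sample, a, b, beta, gamma, g=lambda t: t):
    """Return (lower, upper) for E g(X) from (10) and (9); g increasing on [a, b]."""
    gx = [g(v) for v in sorted(sample)]   # gx[j-1] is g(x^(j))
    n = len(gx)
    r = math.floor(n * beta)
    s = math.floor(n * gamma)
    if r >= n:                            # lower band on F is vacuous
        upper = g(b)
    else:
        upper = ((r + 1) * gx[r] + sum(gx[r + 1:])) / n + beta * (g(b) - gx[r])
    if s >= n:                            # upper band on F is vacuous
        lower = g(a)
    else:
        k = n - s - 1                     # gx[k] is g(x^(n-s))
        lower = (sum(gx[:k]) + (s + 1) * gx[k]) / n - gamma * (gx[k] - g(a))
    return lower, upper

def variance_limits(sample, a, b, beta, gamma):
    """Return the limits in (11), L2 - U1^2 and U2 - L1^2, assuming a >= 0."""
    if a < 0:
        raise ValueError("this sketch assumes a >= 0 so that x**2 is increasing")
    l1, u1 = expectation_limits(sample, a, b, beta, gamma)
    l2, u2 = expectation_limits(sample, a, b, beta, gamma, g=lambda t: t * t)
    return l2 - u1 ** 2, u2 - l1 ** 2

# Made-up sample on [0, 1]; beta = gamma = 0.3 chosen only for illustration.
var_lower, var_upper = variance_limits([0.2, 0.4, 0.5, 0.9], 0.0, 1.0, 0.3, 0.3)
```

With so small a sample the lower limit in (11) is negative and therefore uninformative; since a variance is nonnegative, it may be replaced by 0 in practice.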

4.  More  general  bounds. 

The confidence bounds (2) for $F(x)$ can be generalized to

(12)    $F_n(x) - \beta_j \le F(x), \quad x^{(j)} \le x < x^{(j+1)}, \quad j = 0, 1, \ldots, n,$

(13)    $F(x) \le F_n(x) + \gamma_j, \quad x^{(j)} \le x < x^{(j+1)}, \quad j = 0, 1, \ldots, n,$

for $\beta_j \ge 0$ and $\gamma_j \ge 0$, $j = 0, 1, \ldots, n$. Wald and Wolfowitz (1939) have given expressions for the probability of (12) and (13) holding simultaneously.

If $j/n - \beta_j \le 0$ or $j/n + \gamma_j \ge 1$, the corresponding inequality (12) or (13) is vacuous. For convenience we shall replace each such value $\beta_j$ or $\gamma_j$ to make $j/n - \beta_j = 0$ or $j/n + \gamma_j = 1$. (In particular $\beta_0 = 0$ and $\gamma_n = 0$.) The sequences of values must satisfy $\beta_j - \beta_{j-1} \le 1/n$, $j = 1, \ldots, n$, and $\gamma_{j-1} - \gamma_j \le 1/n$, $j = 1, \ldots, n$, in order to have force in (12) and (13), respectively. The inequalities (12) and (13) are equivalent to

(14)    $\frac{j}{n} - \beta_j \le F(x^{(j)}) \le \frac{j-1}{n} + \gamma_{j-1}, \quad j = 1, \ldots, n.$
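
The replacement of vacuous values and the conditions on successive differences can be sketched as follows. This is an illustration that is not part of the original report; the names `clip_vacuous` and `has_force` are hypothetical, and the sequences are assumed to be given as lists indexed by $j = 0, 1, \ldots, n$.

```python
def clip_vacuous(betas, gammas):
    """Replace vacuous values so that j/n - beta_j >= 0 and j/n + gamma_j <= 1."""
    n = len(betas) - 1
    clipped_b = [min(bj, j / n) for j, bj in enumerate(betas)]
    clipped_g = [min(gj, 1 - j / n) for j, gj in enumerate(gammas)]
    return clipped_b, clipped_g

def has_force(betas, gammas, tol=1e-12):
    """Check beta_j - beta_{j-1} <= 1/n and gamma_{j-1} - gamma_j <= 1/n, j = 1..n."""
    n = len(betas) - 1
    ok_beta = all(betas[j] - betas[j - 1] <= 1 / n + tol for j in range(1, n + 1))
    ok_gamma = all(gammas[j - 1] - gammas[j] <= 1 / n + tol for j in range(1, n + 1))
    return ok_beta and ok_gamma

# Constant band beta_j = gamma_j = 0.3 with n = 4, as in the special case (2).
b_seq, g_seq = clip_vacuous([0.3] * 5, [0.3] * 5)
```

After the replacement, a constant band automatically gives $\beta_0 = 0$ and $\gamma_n = 0$, in agreement with the parenthetical remark in the text.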


The cumulative distribution function satisfying (12) and (13) for given $x^{(1)}, \ldots, x^{(n)}$ with the largest mean puts weight $1/n + (\beta_{j-1} - \beta_j)$ at $x^{(j)}$, $j = 1, \ldots, n$, and $\beta_n$ at $b$. The distribution satisfying (12) and (13) with the smallest mean puts weight $\gamma_0$ at $a$ and $1/n + (\gamma_j - \gamma_{j-1})$ at $x^{(j)}$, $j = 1, \ldots, n$. The resulting inequalities on the expected value are

(15)    $\mathcal{E}X \le \bar{x} + \sum_{j=1}^{n} (\beta_{j-1} - \beta_j)\,x^{(j)} + \beta_n b,$

(16)    $\bar{x} + \gamma_0 a + \sum_{j=1}^{n} (\gamma_j - \gamma_{j-1})\,x^{(j)} \le \mathcal{E}X.$

.1=1  ^  ^ 


The inequalities (4) and (5) are special cases of (15) and (16).
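
As a sketch that is not part of the original report, (15) and (16) can be evaluated directly from the two sequences. The function name `general_mean_limits` is hypothetical, and the sequences are assumed to have already had their vacuous values replaced, so that $\beta_0 = 0$ and $\gamma_n = 0$. The example uses the constant band $\beta = \gamma = 0.3$ with $n = 4$ to illustrate the reduction to (4) and (5).

```python
def general_mean_limits(sample, a, b, betas, gammas):
    """Return (lower, upper) from (16) and (15).

    betas[j] and gammas[j] are beta_j and gamma_j for j = 0, ..., n.
    """
    x = sorted(sample)                    # x[j-1] is x^(j)
    n = len(x)
    xbar = sum(x) / n
    upper = (xbar
             + sum((betas[j - 1] - betas[j]) * x[j - 1] for j in range(1, n + 1))
             + betas[n] * b)
    lower = (xbar
             + gammas[0] * a
             + sum((gammas[j] - gammas[j - 1]) * x[j - 1] for j in range(1, n + 1)))
    return lower, upper

# Constant band 0.3 after replacing vacuous values: beta_j = min(0.3, j/4),
# gamma_j = min(0.3, 1 - j/4), for a made-up sample on [0, 1].
betas = [0.0, 0.25, 0.3, 0.3, 0.3]
gammas = [0.3, 0.3, 0.3, 0.25, 0.0]
lower, upper = general_mean_limits([0.2, 0.4, 0.5, 0.9], 0.0, 1.0, betas, gammas)
```

For this sample the result agrees with the values that (4) and (5) give when $\beta = \gamma = 0.3$, $r = s = 1$.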

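If $B(y)$ and $C(y)$ are monotonically nondecreasing functions on $[0, 1]$ with values in $[0, 1]$ such that $B(j/n) = j/n - \beta_j$ and $C(j/n) = j/n + \gamma_j$, $j = 0, 1, \ldots, n$, then (12) and (13) can be written

(17)    $B[F_n(x)] \le F(x) \le C[F_n(x)]$,    all $x$.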


The inequalities (15) and (16) on $\mathcal{E}X$ can be found by integration of (17) by parts as in (6). The form (17) may be useful in finding limiting probabilities. [See Whittle (1951).]

This approach suggests a number of interesting problems. If one is using these inequalities when the distribution sampled from is a given one, how should one choose $B(y)$ and $C(y)$ to minimize the expected length of the confidence interval? How does the expected length of such a nonparametric interval compare with an interval based on a given parametric form? The intervals presented here are also valid when the distribution sampled from has positive probability at some points, but the confidence level is greater than that stated; modifications of the method for this case are being studied. Further results will be reported in another paper.